Implementation of Preprocessing Techniques in Datamining

Author

  • A. Abdullah
Abstract

Data collected from experiments are usually not suitable for data mining as they stand, because raw data may contain out-of-range values, impossible data combinations, missing values, and similar defects. Analyzing data that has not been carefully screened can produce misleading results, so raw data needs to be preprocessed before mining, and this step can often take a considerable amount of processing time. Data preprocessing includes cleaning, normalization, transformation, feature selection, and feature extraction; its product is the final training data set. In our research, we perform discretization, calculate the similarity or distance between objects, normalize the data, and find correlations between objects or attributes in a data set, in order to understand the data better before the main preprocessing steps.

Keywords: Discretization; Correlation; Normalization; Euclidean distance; Cosine similarity
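
The operations named in the abstract can be illustrated with a minimal sketch. The abstract does not say which variants the author used, so the code below assumes common textbook definitions (min-max normalization, equal-width discretization, Pearson correlation); all function names are hypothetical and chosen only for illustration.

```python
import math


def min_max_normalize(values):
    """Rescale values linearly to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, k):
    """Discretize values into k equal-width bins, indexed 0..k-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / k
    # The maximum value would land in bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; undefined for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation coefficient between two attributes."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant attribute")
    return cov / math.sqrt(var_a * var_b)
```

Normalizing before computing Euclidean distance keeps an attribute with a large numeric range from dominating the result, whereas cosine similarity depends only on the direction of the vectors and is unaffected by rescaling a whole vector.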


Similar resources

Investigation of effective factors in expanding electronic payment in Iran using datamining techniques

E-banking has grown dramatically with the development of ICT industry and banks offer their services to customers from different channels. Nowadays, considering the great economic benefits of electronic banking systems, the need to pay attention to the expansion of electronic banking is increasingly felt in terms of reducing costs and increasing the bank's profitability. The purpose of this stu...


Behavioral Analysis of Traffic Flow for an Effective Network Traffic Identification

Fast and accurate network traffic identification is becoming essential for network management, high quality of service control and early detection of network traffic abnormalities. Techniques based on statistical features of packet flows have recently become popular for network classification due to the limitations of traditional port and payload based methods. In this paper, we propose a metho...


Discrimination of Golab apple storage time using acoustic impulse response and LDA and QDA discriminant analysis techniques

Firmness is one of the most important quality indicators for apple fruits, which is highly correlated with the storage time. The acoustic impulse response technique is one of the most commonly used nondestructive detection methods for evaluating apple firmness. This paper presents a non-destructive method for classification of Iranian apple (Malus domestica Borkh. cv. Golab) according...


Combining domain knowledge and data in datamining systems

Despite the predominant attention for analysis in the datamining literature, data selection and preprocessing have a substantial influence on the success of data mining projects. Since a database is always an imperfect description of a real business process, there are numerous problems to overcome. If the description of the domain is too limited, essential patterns in the environment may not have...


Using Datamining Techniques to Help Metaheuristics: A Short Survey

Hybridizing metaheuristic approaches becomes a common way to improve the efficiency of optimization methods. Many hybridizations deal with the combination of several optimization methods. In this paper we are interested in another type of hybridization, where datamining approaches are combined within an optimization process. Hence, we propose to study the interest of combining metaheuristics an...



Publication date: 2014